An effective 3-in-1 keyword search method over heterogeneous data sources

Authors

  • Guoliang Li
  • Jianhua Feng
  • Beng Chin Ooi
  • Jianyong Wang
  • Lizhu Zhou
Abstract

Conventional keyword search engines are restricted to a given data model and cannot easily adapt to unstructured, semi-structured or structured data. In this paper, we propose an efficient and adaptive keyword search method, called EASE, for indexing and querying large collections of heterogeneous data. To achieve high efficiency in processing keyword queries, we first model unstructured, semi-structured and structured data as graphs, and then summarize the graphs and construct graph indices instead of using traditional inverted indices. We propose an extended inverted index to facilitate keyword-based search, and present a novel ranking mechanism for enhancing search effectiveness. We have conducted an extensive experimental study using real datasets, and the results show that EASE achieves both high search efficiency and high accuracy, and outperforms the existing approaches significantly. © 2008 Elsevier B.V. All rights reserved.
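The idea of treating all three kinds of data as one graph can be illustrated with a small sketch. This is not the EASE algorithm, whose graph summarization, extended inverted index and ranking function are not detailed in the abstract. It is a minimal illustration under stated assumptions: the class `KeywordGraph`, the node identifiers, the `radius` parameter and the distance-based cost are all hypothetical choices made for this example. Documents, XML elements and database rows become nodes, links between them become edges, and a plain inverted index maps each keyword to the nodes that contain it.

```python
from collections import defaultdict, deque


class KeywordGraph:
    """Toy graph over heterogeneous data (hypothetical, not the EASE index)."""

    def __init__(self):
        self.adj = defaultdict(set)    # node -> neighbouring nodes
        self.text = {}                 # node -> raw text
        self.index = defaultdict(set)  # keyword -> nodes containing it

    def add_node(self, node_id, text):
        self.text[node_id] = text
        for token in text.lower().split():
            self.index[token].add(node_id)

    def add_edge(self, u, v):
        # Undirected edge: hyperlink, parent-child, or foreign-key reference.
        self.adj[u].add(v)
        self.adj[v].add(u)

    def distances_from(self, source, radius):
        # Breadth-first search limited to `radius` hops from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if dist[node] == radius:
                continue
            for nxt in self.adj.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        return dist

    def search(self, query, radius=1):
        # Return (cost, center) pairs, lowest cost first. A center qualifies
        # when every keyword occurs within `radius` hops of it; the cost is
        # the sum of the distances to the nearest match for each keyword.
        keywords = query.lower().split()
        if not keywords:
            return []
        results = []
        for center in self.text:
            dist = self.distances_from(center, radius)
            cost = 0
            for kw in keywords:
                hits = [dist[n] for n in self.index.get(kw, ()) if n in dist]
                if not hits:
                    break
                cost += min(hits)
            else:
                results.append((cost, center))
        return sorted(results)


g = KeywordGraph()
g.add_node("doc:1", "keyword search survey")        # unstructured text
g.add_node("xml:paper", "paper")                    # semi-structured element
g.add_node("xml:title", "graph indexing")           # child element
g.add_node("row:author:7", "Li")                    # structured tuple
g.add_node("row:paper:3", "EASE keyword search")    # structured tuple
g.add_edge("xml:paper", "xml:title")                # parent-child
g.add_edge("row:author:7", "row:paper:3")           # foreign key
g.add_edge("doc:1", "row:paper:3")                  # hyperlink (assumed)

print(g.search("keyword li", radius=1))
```

The query above combines a term from a paper row with a term from an author row, so only nodes that lie close to both can serve as answer centers. A real system would precompute and summarize such neighbourhoods rather than run a search from every node at query time.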

Similar resources

An Effective Path-aware Approach for Keyword Search over Data Graphs

Keyword search is a user-friendly alternative to structured query languages for retrieving information from graph-structured data. Efficiently retrieving relevant answers to a keyword query and effectively ranking these answers by relevance are the two main challenges in keyword search over graph-structured data. In this paper, a novel scoring function is proposed, w...

Query Reformulation for Keyword Searching in Mediator Systems

Integration of heterogeneous data sources is still an important task. Mediator systems are one approach to support a structured search over heterogeneous sources. These systems provide comprehensive query languages, which are very powerful but hard to use for inexperienced users. Therefore, easier query interfaces have to be developed. One well-known and effective interface is the keyword searc...

Ontology-Driven Keyword Search for Heterogeneous XML Data Sources

Large numbers of heterogeneous XML data sources are now emerging on the Internet. These data sources are generally autonomous and provide search interfaces based on XML query languages such as XPath or XQuery. Accordingly, users need to learn complex syntax and know the schemas. Keyword search is a user-friendly information discovery technique, which can assist users in obtaining useful information convenient...

Processing XML Keyword Search by Constructing Effective Structured Queries

Recently, keyword search has attracted a great deal of attention in XML databases. It is hard to directly improve the relevancy of XML keyword search because many keyword-matched nodes may not contribute to the results. To address this challenge, in this paper we design an adaptive XML keyword search approach, called XBridge, that can derive the semantics of a keyword query and generate a set...

Keyword Join: Realizing Keyword Search for Information Integration

Information integration has been widely addressed over the last several decades. However, it is far from solved due to the complexity of resolving schema and data heterogeneities. In this paper, we present our attempt to alleviate this difficulty by realizing keyword search functionality for integrating information from heterogeneous databases. Our solution does not require predefined global sc...

Journal:
  • Inf. Syst.

Volume 36

Publication date: 2011